H1B-KV: Hybrid One-Bit Caches for Memory-Efficient Large Language Model Inference
arxiv.org·2d
💨Cache Optimization
Neural Networks from Scratch in Python: Simpler Than You Think
hamza.se·1h·
Discuss: Hacker News
📊Quantization
An enough week
blog.mitrichev.ch·1d
🧮Z3 Solver
YouTube gets ~5% CTR lift on Shorts by replacing embedding tables with Semantic IDs
shaped.ai·22h
📊Feed Optimization
Explicit Lossless Vertex Expanders!
gilkalai.wordpress.com·12h
💎Information Crystallography
[P] Lossless compression for 1D CNNs
reddit.com·11h
📊Quantization
Activation Alchemist: Sculpting Stability with Functional Signatures
dev.to·2h·
Discuss: DEV
🔍Concolic Testing
RND1: Simple, Scalable AR-to-Diffusion Conversion
radicalnumerics.ai·1d·
Discuss: Hacker News
💻Local LLMs
Contrastive Weak-to-strong Generalization
arxiv.org·18h
Information Bottleneck
Why Your Simple Password Is a Mathematical Catastrophe
tawandamunongo.dev·1d·
Discuss: Hacker News
🔐Hash Functions
Sorting encrypted data without decryption: a practical trick
dev.to·7h·
Discuss: DEV
🔐Hash Functions
Integral Signatures of Activation Functions: A 9-Dimensional Taxonomy and Stability Theory for Deep Learning
arxiv.org·18h
🧠Machine Learning
Doing Math with Embeddings for Better AI Ad Targeting
ethicalads.io·1d·
Discuss: Hacker News
📊Feed Optimization
In-Depth Analysis: "Attention Is All You Need"
dev.to·6h·
Discuss: DEV
🧠Intelligence Compression
Neuro-Symbolic AI
en.wikipedia.org·7h·
Discuss: Hacker News
🔲Cellular Automata
Optimal Stopping in Latent Diffusion Models
arxiv.org·18h
🧠Machine Learning
MARC: Memory-Augmented RL Token Compression for Efficient Video Understanding
arxiv.org·18h
🧠Learned Codecs
Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention
huggingface.co·1d·
Discuss: Hacker News
📊Learned Metrics
Exponential Error Bounds for Information Bottleneck Source Coding Problems
arxiv.org·18h
📐Compression Bounds